Question
I tried to implement a simple property-path tokenizer, so that a path can be parsed once and then resolved quickly later on.
Here's my initial implementation:
function tokenize(path: string): (string | number)[] {
    const res = [], reg = /\[\s*(\d+)|["']([^"']+)["']\s*]|[a-z_$0-9]+/gi;
    let a;
    while (a = reg.exec(path)) {
        res.push(a[1] ? parseInt(a[1]) : a[3] || a[2] || a[0]);
    }
    return res;
}
It can take an input like this: first.a.b[123].c['prop1'].d["prop2"].last, and produce the following fast-resolution array:
['first', 'a', 'b', 123, 'c', 'prop1', 'd', 'prop2', 'last']
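For context, here is a minimal sketch of how such a token array might be consumed. The `resolve` helper and the sample `data` object are hypothetical and not part of the original code; they only illustrate why pre-tokenizing makes later lookups cheap.

```javascript
// Hypothetical helper: walks an object using a pre-tokenized path.
function resolve(target, tokens) {
    let current = target;
    for (const key of tokens) {
        if (current === null || current === undefined) {
            return undefined; // the path runs past a missing value
        }
        current = current[key];
    }
    return current;
}

// Sample data, assumed for illustration only.
const data = { first: { a: { b: { 123: { c: { prop1: 'found' } } } } } };
const tokens = ['first', 'a', 'b', 123, 'c', 'prop1'];

console.log(resolve(data, tokens));
console.log(resolve(data, ['first', 'missing', 'x']));
```

Each lookup is a plain loop over the array, with no string parsing at resolution time.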
The problem I'm having is with adding support for nested quotes - ' and ", for an input like this: first["a\'b"].second['"'].
More precisely, I cannot figure out how to take one of the solutions here and inject it into my regex. Those solutions work fine on their own, just not as part of my own regex, so joining the two expressions into one is where I am stuck.
Answer 1:
Match the opening quote with a character set and capture it. Then repeat any character that is not the captured quote, using a negative lookahead with a backreference inside the quantified group, and finally match the same quote again with the backreference.
If you need to handle backslashes before the same delimiter between the quotes, you can alternate with any escaped character before matching a non-delimiter. This will repeatedly match:
- Any escaped character, or
- Any character which is not the captured delimiter
(?:\\.|(?!\2).)*
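The quoted-string part can be tried in isolation first. This is only a sketch: the `quoted` and `samples` names are made up for illustration. On its own, the quote is capture group 1, so the backreference is `\1`; once the pattern is placed after another capture group (such as the `(\d+)` for indexes), the quote becomes group 2 and the backreference has to be renumbered to `\2`.

```javascript
// Standalone quoted-string pattern. Group 1 is the opening quote,
// group 2 is the text between the quotes.
const quoted = /(["'])((?:\\.|(?!\1).)*)\1/;

// String.raw keeps backslashes in the input exactly as typed.
const samples = [
    String.raw`"a'b"`,      // the other quote type inside the string
    String.raw`'"'`,        // a lone double quote inside single quotes
    String.raw`"one\"two"`, // an escaped delimiter
];

for (const s of samples) {
    const m = quoted.exec(s);
    console.log(s, '->', m ? m[2] : null);
}
```

Note that the captured text still contains the backslash of any escape sequence.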
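function normalize(path) {
    // Alternatives: [123] index | ['...'] or ["..."] quoted key | bare identifier
    const res = [], reg = /\[\s*(\d+)(?=\s*\])|\[(["'])((?:\\.|(?!\2).)*)\2\]|[\w$]+/gi;
    let a;
    while (a = reg.exec(path)) {
        // a[2] is set only for the quoted form, so an empty key ('') is kept as ''
        res.push(a[1] ? parseInt(a[1], 10) : a[2] ? a[3] : a[0]);
    }
    return res;
}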
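// String.raw keeps the backslashes, so the escaped quotes actually reach the regex
console.log(normalize(String.raw`first.a.b[123].c['prop1'].d["prop2"].last`));
console.log(normalize(String.raw`first["a\'b"].second['"']`));
console.log(normalize(String.raw`["one\"two"]`));
console.log(normalize(String.raw`['one\'two']`));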
But I'd suggest using a true parser like Acorn instead if at all possible.
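If the regex approach is kept, note that the capture group returns the text between the quotes verbatim, so backslash escapes remain in the token. A possible post-processing step is sketched below. The `unescapeToken` helper is hypothetical, and it assumes that an escape simply means "take the next character literally" (it does not interpret sequences such as `\n` or `\u0041`).

```javascript
// Hypothetical helper: drop the backslash from each escape sequence,
// keeping the character that follows it.
function unescapeToken(token) {
    return token.replace(/\\(.)/g, '$1');
}

console.log(unescapeToken(String.raw`one\"two`)); // one"two
console.log(unescapeToken(String.raw`a\\b`));     // a\b
console.log(unescapeToken('plain'));              // plain
```

It would be applied only to tokens that came from the quoted alternative, not to numeric indexes or bare identifiers.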
Source: https://stackoverflow.com/questions/65730393/joining-two-regex-expressions-for-nested-quotes-support