No \p{L} for JavaScript Regex ? Use Unicode in JS regex [duplicate]

Submitted by 本小妞迷上赌 on 2020-01-23 06:47:07

Question


I need to repeat a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ several times in my regex, but I find this very ugly. So I tried \p{L}, but it does not work in JavaScript.

Any ideas?

My current regex: [a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ][a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ' ,"-]*[a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ'",]+

I would like something like this: [\p{L}][\p{L}' ,"-]*[\p{L}'",]+ (or at least something shorter than the current expression)


Answer 1:


What you need is a subset of what you asked for. First, define which set of characters you actually need: \pL matches every letter from every language.

The explicit character list is somewhat ugly, but it does not affect performance and is arguably the best way around this kind of problem in JS. ECMAScript 2018 adds support for \p{L} (used with the u flag), but it is still far from being implemented by all major browsers.

If it's a matter of taste, you can reduce the ugliness a bit:

var characterSet = 'a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ';
var re = new RegExp('[' + characterSet + ']' + '[' + characterSet + '\' ,"-]*' + '[' + characterSet + '\'",]+');

Credit for this update goes to @Francesco:

var pCL = 'a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ';
var re = new RegExp(`[${pCL}][${pCL}' ,"-]*[${pCL}'",]+`);
console.log(re.source);
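
For engines that do implement the ES2018 Unicode property escapes, something like the following might work as a rough sketch: it tries \p{L} with the u flag and falls back to the explicit character set otherwise. The helper names supportsUnicodePropertyEscapes and buildNameRegex are made up for illustration, and the ^ and $ anchors are an assumption (the pattern in the question has none); drop them if you only need to find a match inside a longer string.

```javascript
// Explicit Latin subset from the question, used as the fallback.
const characterSet = 'a-zA-ZáàâäãåçéèêëíìîïñóòôöõúùûüýÿæœÁÀÂÄÃÅÇÉÈÊËÍÌÎÏÑÓÒÔÖÕÚÙÛÜÝŸÆŒ';

// Hypothetical helper: true if this engine accepts \p{L} with the u flag.
function supportsUnicodePropertyEscapes() {
  try {
    new RegExp('\\p{L}', 'u');
    return true;
  } catch (e) {
    // Engines without ES2018 support throw a SyntaxError here.
    return false;
  }
}

// Hypothetical helper: builds the pattern from the question.
// The ^ and $ anchors are an assumption added for whole-string validation.
function buildNameRegex() {
  const hasPropertyEscapes = supportsUnicodePropertyEscapes();
  const letters = hasPropertyEscapes ? '\\p{L}' : characterSet;
  const flags = hasPropertyEscapes ? 'u' : '';
  return new RegExp(`^[${letters}][${letters}' ,"-]*[${letters}'",]+$`, flags);
}

const nameRegex = buildNameRegex();
```

Usage would then be nameRegex.test(someString). Keep in mind that the two branches are not equivalent: \p{L} accepts letters from every script, while the fallback only accepts the listed Latin subset, so decide first which set you really need.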



Answer 2:


The XRegExp library supports Unicode letter matching:

var unicodeWord = XRegExp("^\\pL+$"); // L: Letter

More examples of matching Unicode in JavaScript are available here:

http://xregexp.com/plugins/



Source: https://stackoverflow.com/questions/50178498/no-pl-for-javascript-regex-use-unicode-in-js-regex
