Regex: String match including punctuation

Submitted by 十年热恋 on 2019-12-24 07:08:01

Question


From another question, I have this expression to match words in a sentence:

var sentence = "Exclamation! Question? Full stop. Ellipsis...";
console.log(sentence.toLowerCase().match(/\w+(?:'\w+)*/g));

It works perfectly. However, now I am looking for a way to match exclamation marks, question marks, and full stops separately. The result should look like this:

[
  "exclamation",
  "!",
  "question",
  "?",
  "full",
  "stop",
  ".",
  "ellipsis",
  "."
]

Only one dot should be matched from the ellipsis, not all three dots separately.

Any help would be greatly appreciated!


Answer 1:


How about using a word boundary to only return one dot from the ellipsis?

var sentence = "Exclamation! Question? Full stop. Ellipsis...";
console.log(sentence.toLowerCase().match(/[a-z]+(?:'[a-z]+)*|\b[!?.]/g));
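
The word boundary works because `\b` only matches between a word character and a non-word character: it holds before the first dot of `...` but not between two dots. A minimal sketch of that behaviour, including one case it does not cover (the helper name `tokenizeWithBoundary` is an assumption for illustration, not part of the original answer):

```javascript
// Hypothetical helper wrapping the word-boundary pattern from the answer.
function tokenizeWithBoundary(text) {
  // Words with optional apostrophe parts, or one punctuation mark
  // that directly follows a word character.
  return text.toLowerCase().match(/[a-z]+(?:'[a-z]+)*|\b[!?.]/g) || [];
}

console.log(tokenizeWithBoundary("Exclamation! Question? Full stop. Ellipsis..."));
// → ['exclamation', '!', 'question', '?', 'full', 'stop', '.', 'ellipsis', '.']

// Limitation: punctuation after a quote has no word boundary in front of it.
console.log(tokenizeWithBoundary('He said "stop".'));
// → ['he', 'said', 'stop']
```

So this variant is fine for plain sentences, but it drops punctuation that comes right after a quotation mark.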

Or a negative lookahead:

var sentence = "Exclamation! Question? Full stop. Ellipsis...";
console.log(sentence.toLowerCase().match(/[a-z]+(?:'[a-z]+)*|[!?.](?![!?.])/g));
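
The two patterns are not interchangeable when a run mixes different marks: the word boundary keeps the first mark of the run, while the negative lookahead keeps the last one. A small sketch (the sample string and variable names are assumptions for illustration):

```javascript
// The two patterns from the answer, side by side.
const boundaryPattern = /[a-z]+(?:'[a-z]+)*|\b[!?.]/g;
const lookaheadPattern = /[a-z]+(?:'[a-z]+)*|[!?.](?![!?.])/g;

const sample = "really?!";

// \b holds only between "y" and "?", so the first mark is kept.
console.log(sample.match(boundaryPattern));  // → ['really', '?']

// "?" is followed by "!", so only the final mark passes the lookahead.
console.log(sample.match(lookaheadPattern)); // → ['really', '!']
```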

For the extended scenario from your comment, a negative lookbehind (supported in ES2018 and later engines) seems to be effective:

var sentence = "You're \"Pregnant\"??? How'd This Happen?! The vasectomy YOUR 1 job. Let's \"talk this out\"...";
console.log(sentence.toLowerCase().match(/[a-z\d]+(?:'[a-z\d]+)*|(?<![!?.])[!?.]/g));
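
If the target engine does not support lookbehind, one possible workaround is to match each punctuation run as a whole and keep only its first character. This is a sketch of an alternative, not the answer's own method; the variable names are assumptions:

```javascript
const sentence = "You're \"Pregnant\"??? How'd This Happen?! The vasectomy YOUR 1 job. Let's \"talk this out\"...";

// Match words (letters and digits, with apostrophe parts) or whole punctuation runs.
const rawTokens = sentence.toLowerCase().match(/[a-z\d]+(?:'[a-z\d]+)*|[!?.]+/g) || [];

// Reduce every punctuation run to its first character; words pass through unchanged.
const withoutLookbehind = rawTokens.map(function (token) {
  return /^[!?.]/.test(token) ? token[0] : token;
});

console.log(withoutLookbehind);
// → ["you're", 'pregnant', '?', "how'd", 'this', 'happen', '?', 'the', 'vasectomy',
//    'your', '1', 'job', '.', "let's", 'talk', 'this', 'out', '.']
```

For this sentence the result is the same as with the lookbehind pattern, since both keep the first mark of every run.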



Answer 2:


Try the code below:

var sentence = "Exclamation! Question? Full stop. Ellipsis...";
console.log(sentence.toLowerCase().match(/[?!.]|\w+/g));

That returns every dot of the ellipsis separately. If you want only one dot, you could use something like this:

var sentence = "Exclamation!!! Question??? Full stop. Ellipsis...";

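// Match words, or runs of the same punctuation mark.
var arr = sentence.toLowerCase().match(/[?]+|[!]+|[.]+|\w+/g);
// Collapse each punctuation run to one character. The group is limited to
// [?!.] so doubled letters in words (e.g. "full") are not collapsed.
arr = arr.map(function(item){
	return item.replace(/([?!.])\1+/g, "$1");
});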

console.log(arr);
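
One thing to watch with the collapsing step: a backreference on `(.)` collapses any doubled character, including letters inside words, whereas a group limited to punctuation leaves words intact. A short sketch of the difference (the token list and variable names are assumptions for illustration):

```javascript
const tokens = ["full", "ellipsis", "...", "???"];

// Backreference on any character: doubled letters are collapsed too.
const anyChar = tokens.map(function (t) {
  return t.replace(/(.)\1+/g, "$1");
});
console.log(anyChar);   // → ['ful', 'elipsis', '.', '?']

// Backreference limited to punctuation: words are left untouched.
const punctOnly = tokens.map(function (t) {
  return t.replace(/([?!.])\1+/g, "$1");
});
console.log(punctOnly); // → ['full', 'ellipsis', '.', '?']
```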


Source: https://stackoverflow.com/questions/51576619/regex-string-match-including-punctuation
