Count sentences in string with JavaScript

允我心安 submitted on 2020-12-10 07:22:19

Question


There are already a couple of similar questions:

  • Splitting textarea sentences into array and finding out which sentence changed on keyup()
  • JS RegEx to split text into sentences
  • Javascript RegExp for splitting text into sentences and keeping the delimiter
  • Split string into sentences in javascript

My situation is a bit different.

I need to count the number of sentences in a string.

The closest answer to what I need would be:

str.replace(/([.?!])\s*(?=[A-Z])/g, "$1|").split("|")

The only problem here is that this RegEx assumes a sentence starts with a capital letter, which may not always be the case.
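
To illustrate the limitation, here is a minimal sketch. The helper name splitOnCapital and the sample strings are made up for this example:

```javascript
// Split with the regex quoted above; it only breaks before a capital letter.
function splitOnCapital(str) {
  return str.replace(/([.?!])\s*(?=[A-Z])/g, "$1|").split("|");
}

const capitalized = splitOnCapital("It works. Really well.");
const lowercase = splitOnCapital("it works. really well.");

console.log(capitalized.length); // 2
console.log(lowercase.length);   // 1, the second sentence is missed
```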

To be more specific, I would define a sentence as:

  • Starting with a letter (capital or not), a number or even a symbol (such as $ or €).
  • Ending with a punctuation mark, such as ".", "?" or "!".

However, if a sentence contains a number that itself contains a "." or a ",", it should still be counted as one sentence, not two.

Last but not least, we can assume that every sentence except the first one is preceded by a space.

Given an arbitrary string, how can I count the number of sentences it contains with JavaScript (or CoffeeScript for that matter)?


Answer 1:


One regex to solve your problem is:

\w[.?!](\s|$)

The parts are as follows:

\w - A word character (letter, digit or underscore).
[.?!] - The punctuation as specified.
(\s|$) - A whitespace character OR the end of the string.

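You might be tempted to use a character class instead of a group for the final element:

[\s|$]

but that does not work (checked on https://regex101.com/): inside a character class, | and $ are literal characters, so the class cannot match the end of the string.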
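
The difference between the two forms can be checked directly. A small sketch, with a made-up sample string:

```javascript
const group = /\w[.?!](\s|$)/g;      // alternation: whitespace OR end of string
const charClass = /\w[.?!][\s|$]/g;  // class: whitespace, a literal "|" or a literal "$"

const sample = "First one. Second one.";

const groupCount = (sample.match(group) || []).length;
const classCount = (sample.match(charClass) || []).length;

console.log(groupCount); // 2
console.log(classCount); // 1, the sentence at the end of the string is missed
```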

Tested on the following:

Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the Renaissance. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet..", comes from a line in section 1.10.32.

It finds six sentences (the original post bolded the end of each sentence, not the actual match). Note that the capturing group might pose a problem if you depend on group numbering for any reason.
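
As a sketch of how this regex could be turned into a counter, assuming a helper named countSentences (not part of the original answer) and a shortened, made-up sample:

```javascript
// Count sentence endings: a word character, then . ? or !,
// then whitespace or the end of the string.
function countSentences(text) {
  const matches = text.match(/\w[.?!](\s|$)/g);
  return matches === null ? 0 : matches.length;
}

const sample =
  "Lorem Ipsum comes from sections 1.10.32 and 1.10.33. " +
  "is it over 2000 years old? yes! It costs $4.50 today.";

console.log(countSentences(sample)); // 4
console.log(countSentences(""));     // 0
```

Dots inside numbers are skipped because they are not followed by whitespace. Two caveats follow from the pattern itself: an abbreviation such as "e.g. " is counted as a sentence end, and a sentence whose final punctuation follows a non-word character such as ")" is not counted.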




Answer 2:


This works if each sentence in the string ends with a single punctuation character.

const text = ""; // insert your string here
const re = /[.!?]/;
// split() yields one more piece than there are terminators
const numOfSentences = text.split(re).length - 1;
console.log(numOfSentences);
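
A sketch of where this approach over-counts, using made-up input and a hypothetical helper name: every ".", "!" or "?" is treated as a sentence end, including decimal points and repeated punctuation.

```javascript
// Same split-based technique, wrapped in a helper for comparison.
function countTerminators(text) {
  return text.split(/[.!?]/).length - 1;
}

console.log(countTerminators("Hello there. How are you?")); // 2
console.log(countTerminators("It costs 4.50 today."));      // 2, the decimal point is counted
console.log(countTerminators("Wait... what?"));             // 4, each dot is counted
```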



Answer 3:


I figured out a much simpler solution.

// `text` is the input string; the trailing space lets the last sentence end in ". " too
const padded = text + " ";
const count = padded.split(". ").length - 1;
console.log(count);
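
Note that splitting on ". " only counts sentences ending in a period. One possible variation, not from the original answer, extends the same idea to "?" and "!"; the helper name is hypothetical:

```javascript
// Same "append a space, then split" idea, extended to ? and !.
function countByTrailingSpace(text) {
  const padded = text + " "; // lets the final sentence end in "<punctuation><space>" too
  return padded.split(/[.?!] /).length - 1;
}

console.log(countByTrailingSpace("It costs 4.50 today. is that ok? yes!")); // 3
```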


Source: https://stackoverflow.com/questions/35215348/count-sentences-in-string-with-javascript
