I'm trying to put together a regular expression for a JavaScript command that accurately counts the number of words in a textarea.
One solution I had found is as fo
This should do what you're after:
value.match(/\S+/g).length;
Rather than splitting the string, you're matching on any sequence of non-whitespace characters.
There's the added bonus of being easily able to extract each word if needed ;)
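One caveat worth noting: `match` returns `null` rather than an empty array when nothing matches, so reading `.length` throws on an empty textarea. A minimal sketch of a guarded version follows; the function name `countWords` is made up for illustration and is not part of the answer above.

```javascript
// Hypothetical helper; the name countWords is an assumption for this sketch.
function countWords(value) {
  // match() with the g flag returns null when there is no match at all,
  // so fall back to an empty array before reading .length.
  const words = value.match(/\S+/g) || [];
  return words.length;
}

console.log(countWords("Lorem ipsum dolor , sit amet"));
console.log(countWords(""));
```

With this pattern a stray comma surrounded by spaces still counts as a word, since it is a run of non-whitespace characters.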
The correct regexp would be /\s+/, in order to discard non-words:
'Lorem ipsum dolor , sit amet'.split(/\S+/g).length
7
'Lorem ipsum dolor , sit amet'.split(/\s+/g).length
6
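Splitting on whitespace also produces empty strings when the text starts or ends with whitespace, which inflates the count. A small sketch of one way around that, assuming a hypothetical helper named `countBySplit` (not from the answer above):

```javascript
// Hypothetical helper; trims first so split() does not yield empty edge items.
function countBySplit(value) {
  const trimmed = value.trim();
  // "".split(/\s+/) returns [""] (length 1), so handle the empty case explicitly.
  if (trimmed === "") return 0;
  return trimmed.split(/\s+/).length;
}

console.log("  a b  ".split(/\s+/).length); // counts the empty edge strings too
console.log(countBySplit("  a b  "));
```

The lone comma in the sample sentence is still counted here, because it is separated by whitespace on both sides.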
Try to count anything that is not whitespace and with a word boundary:
value.split(/\b\S+\b/g).length
You could also try to use unicode ranges, but I am not sure if the following one is complete:
value.split(/[\u0080-\uFFFF\w]+/g).length
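As an alternative to hand-written ranges, engines that support Unicode property escapes (ES2018, with the `u` flag) can match letters, marks, and digits by category. This is a sketch under that assumption, and `countUnicodeWords` is a name invented for the example:

```javascript
// Hypothetical helper; requires an engine with \p{...} support (the u flag).
function countUnicodeWords(value) {
  // \p{L} = letters, \p{M} = combining marks, \p{N} = numbers, plus underscore.
  const words = value.match(/[\p{L}\p{M}\p{N}_]+/gu) || [];
  return words.length;
}

console.log(countUnicodeWords("Привет мир"));
console.log(countUnicodeWords("héllo wörld, ça va"));
```

Plain `\w` would miss the Cyrillic sample entirely, because without extra flags it only covers ASCII letters, digits, and underscore.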
Try
value.match(/\w+/g).length;
This matches any run of characters that can be in a word, whereas something like:
value.match(/\S+/g).length;
will result in an incorrect count if the user adds commas or other punctuation that is not followed by a space - or adds a comma with a space either side of it.
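The difference is easiest to see side by side. The sketch below runs both patterns over a few sample strings; the helper `countWith` and the samples are made up for illustration.

```javascript
// Hypothetical helper: counts matches of a global pattern, treating null as zero.
function countWith(pattern, value) {
  return (value.match(pattern) || []).length;
}

const samples = ["one,two three", "one , two", "don't stop"];
for (const s of samples) {
  console.log(
    JSON.stringify(s),
    "\\S+:", countWith(/\S+/g, s),
    "\\w+:", countWith(/\w+/g, s)
  );
}
```

Note that `\w+` has its own blind spot: the apostrophe in "don't" is not a word character, so that word is counted twice.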
For me this gave the best results:
value.split(/\b\W+\b/).length
with
var words = value.split(/\b\W+\b/)
you get all words.
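A quick sketch of how this split behaves on the sample sentence used earlier in the thread. The wrapper `splitIntoWords` is hypothetical; it adds a guard because splitting a string with no word characters still returns one element.

```javascript
// Hypothetical wrapper around the split above.
function splitIntoWords(value) {
  // With no word characters there is no word boundary, so split() would
  // return the whole input as a single chunk; report no words instead.
  if (!/\w/.test(value)) return [];
  // \b\W+\b matches a run of non-word characters sitting between two words.
  return value.split(/\b\W+\b/);
}

console.log(splitIntoWords("Lorem ipsum dolor , sit amet"));
console.log(splitIntoWords("").length);
```

Leading or trailing punctuation and whitespace are not removed by this pattern, since there is no word boundary on their outer side; they stay attached to the first or last element without changing the count.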
Explanation:
I recommend learning regular expressions. It's a great skill to have because they are so powerful. ;-)
You could extend or change your methods like this:
document.querySelector("#wordcount").innerHTML = document.querySelector("#editor").value.split(/\b(.*?)\b/).length -1;
if you want to match things like email addresses as well, and
document.querySelector("#wordcount").innerHTML = document.querySelector("#editor").value.trim().split(/\s+/g).length -1;
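To keep the counter current while the user types, the same idea can be wrapped in a function and attached to the textarea's `input` event. This is only a sketch: `updateWordCount` is a made-up name, it writes to `textContent` instead of `innerHTML`, and it drops the `- 1` on the assumption that the text is trimmed first, so `split` returns exactly one element per word.

```javascript
// Hypothetical helper: accepts any objects exposing `value` and `textContent`.
function updateWordCount(editor, output) {
  const text = editor.value.trim();
  // An empty string would split into [""], so treat it as zero words.
  const count = text === "" ? 0 : text.split(/\s+/).length;
  output.textContent = String(count);
  return count;
}

// Only wire up the DOM when running in a browser.
if (typeof document !== "undefined") {
  const editor = document.querySelector("#editor");
  const output = document.querySelector("#wordcount");
  if (editor && output) {
    editor.addEventListener("input", () => updateWordCount(editor, output));
    updateWordCount(editor, output); // initial count on page load
  }
}
```

Taking the elements as parameters keeps the counting logic usable outside the page, for example in a quick check with plain objects.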
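Also try using \s, as it is Unicode-aware in JavaScript (it matches Unicode whitespace), whereas \w only covers ASCII letters, digits, and underscore.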
Source: http://www.regular-expressions.info/charclass.html