How can I search some text for any and all hashtags (alphanumeric AND underscore AND hyphen) and wrap them in span tags, e.g. search
some_string = "this is so
This is the regular expression you want:
/(#[a-z0-9][a-z0-9\-_]*)/ig
The i makes it case insensitive, which you already had. But the g makes it look through the whole string ("g" stands for "global"). Without the g, the matching stops at the first match.
This also includes a fix to remove the incorrect parenthesis and some unneeded backslashes.
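To see the effect of the g flag when wrapping, here is a minimal sketch. It assumes the text lives in a variable called some_string, as in the question; the sample text itself is made up for illustration.

```javascript
// Sample input is illustrative only, it is not the string from the question.
var some_string = "Loving #JavaScript and #regex-tips, see #tag_1";

// $1 refers to the single capture group, i.e. the whole hashtag.
var wrapped = some_string.replace(/(#[a-z0-9][a-z0-9\-_]*)/ig, '<span>$1</span>');

// Same pattern without the g flag: only the first hashtag is wrapped.
var firstOnly = some_string.replace(/(#[a-z0-9][a-z0-9\-_]*)/i, '<span>$1</span>');

console.log(wrapped);
// Loving <span>#JavaScript</span> and <span>#regex-tips</span>, see <span>#tag_1</span>
console.log(firstOnly);
// Loving <span>#JavaScript</span> and #regex-tips, see #tag_1
```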
A solution that works with multiline text and non-Latin characters:
var getHashTags = function(string) {
  var hashTags, i, len, word, words;
  words = string.split(/[\s\r\n]+/);
  hashTags = [];
  for (i = 0, len = words.length; i < len; i++) {
    word = words[i];
    if (word.indexOf('#') === 0) {
      hashTags.push(word);
    }
  }
  return hashTags;
};
or in CoffeeScript:
getHashTags = (string) ->
  words = string.split /[\s\r\n]+/
  hashTags = []
  hashTags.push word for word in words when word.indexOf('#') is 0
  hashTags
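A quick way to see what the split-based approach returns is to run it on mixed input. The sketch below uses a compact filter variant of the same idea; the name getHashTagsCompact and the sample text are made up for illustration. Note that punctuation attached to a word stays in the result, because only whitespace separates the tokens.

```javascript
// Hypothetical compact variant: split on whitespace, keep words starting with '#'.
function getHashTagsCompact(text) {
  return text.split(/\s+/).filter(function (word) {
    return word.indexOf('#') === 0;
  });
}

// Illustrative input with a newline and a non-Latin hashtag.
var sample = "#привет world\n#tag_two, and #three!";
var tags = getHashTagsCompact(sample);

console.log(tags);
// [ '#привет', '#tag_two,', '#three!' ]
```

If the trailing comma or exclamation mark is unwanted, the tokens would need an extra cleanup step, or one of the regex-based answers may be a better fit.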
Try this replace call:
EDIT: if you want to skip strings like http://site.com/#tag then use:
var repl = some_string.replace(/(^|\W)(#[a-z\d][\w-]*)/ig, '$1<span>$2</span>');
If you don't want to match http://site/#hashs, use this one instead*:
string.replace(/(^|\s)(#[a-zA-Z0-9][\w-]*)\b/g, "$1<span>$2</span>");
It will match:
#word, #word_1 and #word-1
#word in #word? or #word" or #word. or #word,
It won't match:
"#word nor ,#word nor .#word
/#word
#_word nor #-word
wor#d
The things you want and don't want to match may vary in different cases.
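The lists above can be checked with a short script. This is only a sketch: the helper name wrapHashtags is made up, and the pattern puts the hashtag in its own capture group so that $2 in the replacement has something to refer to.

```javascript
// (^|\s) is group 1 (what precedes the tag), the hashtag itself is group 2.
var hashtagPattern = /(^|\s)(#[a-zA-Z0-9][\w-]*)\b/g;

// Hypothetical helper: wraps every matched hashtag in a span.
function wrapHashtags(text) {
  return text.replace(hashtagPattern, "$1<span>$2</span>");
}

var shouldMatch = ["#word", "#word_1", "#word-1", "#word?", "#word\"", "#word.", "#word,"];
var shouldNotMatch = ["\"#word", ",#word", ".#word", "/#word", "#_word", "#-word", "wor#d"];

// Inputs from the first list come back with a span, the others come back unchanged.
shouldMatch.concat(shouldNotMatch).forEach(function (s) {
  console.log(s, "->", wrapHashtags(s));
});
```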
Try it yourself at regex101.
* The current accepted answer, posted by @anubhava, claims to skip URL hashes but fails to do so.
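A small comparison on a made-up input shows the difference between the two lookbehind substitutes. The variable names and the sample string are assumptions for illustration only.

```javascript
// Illustrative input containing a URL fragment and a real hashtag.
var text = "see http://site.com/#tag and #real";

// (^|\W): "/" is a non-word character, so the URL fragment gets wrapped too.
var withNonWord = text.replace(/(^|\W)(#[a-z\d][\w-]*)/ig, '$1<span>$2</span>');

// (^|\s): only the start of the string or whitespace may precede the "#".
var withWhitespace = text.replace(/(^|\s)(#[a-zA-Z0-9][\w-]*)\b/g, '$1<span>$2</span>');

console.log(withNonWord);
// see http://site.com/<span>#tag</span> and <span>#real</span>
console.log(withWhitespace);
// see http://site.com/#tag and <span>#real</span>
```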