I\'m writing a small PHP script to grab the latest half dozen Twitter status updates from a user feed and format them for display on a webpage. As part of this I need a rege
I have devised this: /(^|\s)#([[:alnum:]])+/gi
I found Gazlers answer to work, although the regex added a blank space at the beginning of the hashtag, so I removed the first part:
(^|\s)
This works perfectly for me now:
#(\w*[a-zA-Z_0-9]+\w*)
Example here: http://rubular.com/r/dS2QYZP45n
(^|\s)#(\w*[a-zA-Z_]+\w*)
PHP
$strTweet = preg_replace('/(^|\s)#(\w*[a-zA-Z_]+\w*)/', '\1#<a href="http://twitter.com/search?q=%23\2">\2</a>', $strTweet);
This regular expression says a # followed by 0 or more characters [a-zA-Z0-9_], followed by an alphabetic character or an underscore (1 or more), followed by 0 or more word characters.
http://rubular.com/r/opNX6qC4sG <- test it here.
It's actually better to search for characters that aren't allowed in a hashtag otherwise tags like "#Trentemøller" wont work.
The following works well for me...
preg_match('/([ ,.]+)/', $string, $matches);