Regex word boundary alternative

两盒软妹~` submitted on 2019-12-07 03:54:24

Question


I was using the standard \b word boundary. However, it doesn't quite deal with the dot (.) character the way I want it to.

So the following regex:

\b(\w+)\b

will match cats and dogs separately inside cats.dogs if I have a string that says: cats and dogs don't make cats.dogs
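
To make the behaviour concrete, here is a minimal PHP sketch that runs the pattern against the sample sentence. The variable names are illustrative and not part of the original post.

```php
<?php
// Sample sentence taken from the question.
$str = "cats and dogs don't make cats.dogs";

// \b sits between a word character and a non-word character,
// so both the dot and the apostrophe count as boundaries.
preg_match_all('/\b(\w+)\b/', $str, $matches);

print_r($matches[1]);
// cats, and, dogs, don, t, make, cats, dogs
```

The two halves of cats.dogs come back as separate matches, which is the unwanted result.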

I need a word boundary alternative that will match a whole word only if:

  1. it does not contain the dot (.) character
  2. it is encapsulated by at least one space ( ) character on each side

Any ideas?!

P.S. I need this for PHP


Answer 1:


You could try using (?<=\s) before the word and (?=\s) after it, in place of the \b, to ensure that there is whitespace on both sides. You might also want to allow for the word being at the start or end of the string, with (?<=\s|^) and (?=\s|$).

This will automatically exclude "words" with a . in them, but it would also exclude a word at the end of a sentence since there is no space between it and the full stop.
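
As a rough sketch of how these lookarounds could be used from PHP, the snippet below applies them around \w+ to the sentence from the question. The second pattern, built from the negated lookarounds (?<!\S) and (?!\S), is an assumption of mine rather than something the answer proposes; it expresses the same condition as "no non-whitespace character on either side".

```php
<?php
$str = "cats and dogs don't make cats.dogs";

// Lookarounds from the answer: whitespace or a string edge on both sides.
preg_match_all('/(?<=\s|^)\w+(?=\s|$)/', $str, $a);

// Alternative phrasing (assumption, not from the answer):
// not preceded by a non-space, not followed by a non-space.
preg_match_all('/(?<!\S)\w+(?!\S)/', $str, $b);

print_r($a[0]);            // cats, and, dogs, make
var_dump($a[0] === $b[0]); // bool(true)
```

Note that don't is skipped as well, because the apostrophe is not matched by \w and is not whitespace either.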




Answer 2:


What you are trying to match can be done easily with array and string functions.

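// $str holds the input text, e.g. "cats and dogs don't make cats.dogs"
$parts = explode(' ', $str);
// Drop empty pieces (left by repeated spaces) and any piece containing a dot
$res = array_filter($parts, function ($e) {
    return $e !== "" && strpos($e, ".") === false;
});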

I recommend this method as it saves time. Otherwise, spending a few hours hunting for a good regex solution is quite unproductive.
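
If the input might contain tabs or newlines rather than single spaces, one possible variant of the same idea splits on runs of whitespace instead. This is only a sketch: the helper name wordsWithoutDots is hypothetical and does not come from the answer.

```php
<?php
// Hypothetical helper, not part of the original answer.
function wordsWithoutDots(string $str): array
{
    // Split on runs of any whitespace and discard empty pieces.
    $parts = preg_split('/\s+/', $str, -1, PREG_SPLIT_NO_EMPTY);

    // Keep only tokens without a dot, then reindex from 0.
    return array_values(array_filter($parts, function ($e) {
        return strpos($e, '.') === false;
    }));
}

$words = wordsWithoutDots("cats and\tdogs don't  make cats.dogs");
print_r($words);
// cats, and, dogs, don't, make
```

Unlike a \w+ based regex, this approach keeps don't, since it only rejects tokens that contain a dot.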



Source: https://stackoverflow.com/questions/14074308/regex-word-boundary-alternative
