Removing consecutive duplicate words in a string

终归单人心 2021-01-14 20:40

I am trying to write a function that removes consecutive duplicate words within a string. It's vital that one of the matches found by the regular expression remains. In ot

2 answers

没有蜡笔的小新 (OP)

2021-01-14 21:04
You may use a regex like \b(\S+)(?:\s+\1\b)+ and replace with $1:
```
// Collapse each run of consecutive repeated words to its first occurrence.
$string = preg_replace('/\b(\S+)(?:\s+\1\b)+/i', '$1', $string);
```

Details:
- \b(\S+) - Group 1, capturing one or more non-whitespace characters preceded by a word boundary (\b(\w+) might suit better here)
- (?:\s+\1\b)+ - one or more sequences of:
  - \s+ - one or more whitespace characters
  - \1\b - a backreference to the value captured in Group 1; the trailing \b keeps it from matching only the start of a longer word

The replacement pattern is $1, a replacement backreference that inserts the value captured in Group 1.
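
As a minimal sketch, the pattern and replacement described above could be wrapped in a small helper. The function name `removeConsecutiveDuplicates` is hypothetical, chosen only for illustration, and the sample strings are made up. The last two calls show one case where the \S+ and \w+ variants behave differently: a hyphenated token is a single "word" for \S+ but not for \w+.

```php
<?php
// Hypothetical helper name; the pattern and replacement are the ones from the answer.
function removeConsecutiveDuplicates(string $text): string
{
    // preg_replace() returns null on failure, so fall back to the unchanged input.
    return preg_replace('/\b(\S+)(?:\s+\1\b)+/i', '$1', $text) ?? $text;
}

echo removeConsecutiveDuplicates('this is is a test test test'), "\n"; // this is a test

// \w+ stops at the hyphen, so the hyphenated token is not seen as one repeated word.
echo preg_replace('/\b(\w+)(?:\s+\1\b)+/i', '$1', 'foo-bar foo-bar'), "\n"; // foo-bar foo-bar
// \S+ captures the whole token, so the repeat is collapsed.
echo removeConsecutiveDuplicates('foo-bar foo-bar'), "\n"; // foo-bar
```

Which variant is preferable depends on whether punctuation-joined tokens should count as single words in your input.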

Note that the /i (case-insensitive) modifier also makes the \1 backreference case-insensitive, so "I have a dog Dog DOG" becomes "I have a dog".
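
A short sketch of the effect of the modifier, using the example sentence from the note. The variable names are illustrative only. With /i the three spellings collapse to the first one (the text captured in Group 1); without it they are treated as different words and the string is left alone.

```php
<?php
$pattern = '/\b(\S+)(?:\s+\1\b)+/';   // the answer's pattern, without any modifier
$input   = 'I have a dog Dog DOG';

// With /i, \1 matches "Dog" and "DOG" as repeats of "dog".
$insensitive = preg_replace($pattern . 'i', '$1', $input);
// Without /i, "dog", "Dog" and "DOG" are three distinct words.
$sensitive = preg_replace($pattern, '$1', $input);

echo $insensitive, "\n"; // I have a dog
echo $sensitive, "\n";   // I have a dog Dog DOG
```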