Split string with a single occurrence (not twice) of a delimiter in JavaScript

星月不相逢

This is better explained with an example. I want to achieve a split like this:

two-separate-tokens-this--is--just--one--token-another

6 answers

太阳男子

2020-12-18 05:19

~~str.match(/(?!-)(.*?[^\-])(?=(?:-(?!-)|$))/g);~~

~~Check this fiddle.~~

~~Explanation:~~

The non-greedy pattern (?!-)(.*?[^\-]) matches a string that neither starts nor ends with a dash, and the pattern (?=(?:-(?!-)|$)) requires that match to be followed by a single dash or by the end of the line. The /g modifier makes match find all occurrences, not just the first one.

Edit (based on OP's comment):

str.match(/(?:[^\-]|--)+/g);

Check this fiddle.

Explanation:

The pattern (?:[^\-]|--) matches either a non-dash character or a double dash. The + quantifier repeats that group as many times as possible. The /g modifier makes match find all occurrences, not just the first one.

Note:

The pattern /(?:[^-]|--)+/g works in JavaScript as well, but JSLint requires the - inside square brackets to be escaped; otherwise it reports an error.
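To see the edited pattern at work, here is a minimal sketch run against the string from the question. The variable names `str` and `tokens` are only illustrative.

```javascript
// Sample input taken from the question.
var str = "two-separate-tokens-this--is--just--one--token-another";

// Each match is a run made of non-dash characters and/or double dashes.
var tokens = str.match(/(?:[^\-]|--)+/g);

console.log(tokens);
// ["two", "separate", "tokens", "this--is--just--one--token", "another"]

// An odd run of dashes: the first two are consumed as a pair,
// the third one acts as the delimiter.
var oddRun = "a---b".match(/(?:[^\-]|--)+/g);

console.log(oddRun);
// ["a--", "b"]
```

Whether the `a---b` behaviour is what you want depends on how runs of three or more dashes should be interpreted.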

Happy的楠姐

2020-12-18 05:23
@Ωmega has the right idea in using match instead of split, but his regex is more complicated than it needs to be. Try this one:
```
s.match(/[^-]+(?:--[^-]+)*/g);
```
It reads exactly the way you expect it to work: Consume one or more non-hyphens, and if you encounter a double hyphen, consume that and go on consuming non-hyphens. Repeat as necessary.

EDIT: Apparently the source string may contain runs of two or more consecutive hyphens, which should not be treated as delimiters. That can be handled by adding a + to the second hyphen:
```
s.match(/[^-]+(?:--+[^-]+)*/g);
```
You can also use a {min,max} quantifier:
```
s.match(/[^-]+(?:-{2,}[^-]+)*/g);
```
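As a rough illustration of the difference between the first pattern and the `{2,}` variant, the sketch below runs both on a made-up input that contains a run of three hyphens. The sample string and variable names are assumptions for the demo only.

```javascript
// Hypothetical input: one double hyphen and one triple hyphen.
var s = "a-b--c-d---e";

// First pattern: only an exact pair followed by non-hyphens continues a token.
var pairsOnly = s.match(/[^-]+(?:--[^-]+)*/g);

// {2,} variant: any run of two or more hyphens continues the token.
var runsOfTwoOrMore = s.match(/[^-]+(?:-{2,}[^-]+)*/g);

console.log(pairsOnly);
// ["a", "b--c", "d", "e"]

console.log(runsOfTwoOrMore);
// ["a", "b--c", "d---e"]
```

With the first pattern the triple hyphen cannot be consumed, so `d` and `e` end up as separate tokens; with `{2,}` the whole run stays inside one token.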
甜味超标

2020-12-18 05:27
You would need a negative lookbehind assertion as well as your negative lookahead:
```
(?<!-)-(?!-)
```
http://regexr.com?31qrn

Unfortunately, the JavaScript regular expression engine did not support lookbehind when this was written (lookbehind assertions were added in ES2018). Without them, I believe the only workaround is to inspect your results afterwards and remove any matches that would have failed the lookbehind assertion (or in this case, combine them back into a single match).
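For runtimes that do implement ES2018 lookbehind assertions (an assumption about your environment; older engines reject this regex literal with a syntax error), the pattern above can be passed straight to `split`. A minimal sketch, with illustrative variable names:

```javascript
// Sample input taken from the question.
var str = "two-separate-tokens-this--is--just--one--token-another";

// Split on a hyphen that is neither preceded nor followed by another hyphen.
// Requires an engine with lookbehind support (ES2018 or later).
var parts = str.split(/(?<!-)-(?!-)/);

console.log(parts);
// ["two", "separate", "tokens", "this--is--just--one--token", "another"]
```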
孤独总比滥情好

2020-12-18 05:28
You can achieve this without negative lookbehind (which, as @jbabey mentioned, is not supported in JS) like this (inspired by this article):
```
\b-\b
```
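A short sketch of the word-boundary approach follows. It works on the question's input, but note the assumption it relies on: `\b` only matches between a word character (`[A-Za-z0-9_]`) and a non-word character, so a hyphen next to punctuation is not treated as a delimiter. The second input is a made-up case to show that.

```javascript
// Sample input taken from the question.
var str = "two-separate-tokens-this--is--just--one--token-another";

// A hyphen with a word character directly on both sides is a delimiter.
var parts = str.split(/\b-\b/);

console.log(parts);
// ["two", "separate", "tokens", "this--is--just--one--token", "another"]

// Caveat (hypothetical input): "." is not a word character, so there is no
// word boundary between "-" and "." and the string is not split at all.
var punctuated = "a-.b".split(/\b-\b/);

console.log(punctuated);
// ["a-.b"]
```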

日久生厌

2020-12-18 05:34

The regular expressions didn't handle edge cases well (such as 5 consecutive delimiters), and I also had to replace each double delimiter with a single one. That gets tricky too, because '----'.replace('--', '-') gives '---' rather than '--' (a string pattern only replaces the first occurrence). So I wrote a function that loops over the characters and does everything in one pass, although I'm concerned that using the string accumulator can be slow :-s

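// Splits `id` on single occurrences of `delim`. A doubled `delim` is an
// escaped literal and collapses to one `delim` inside the token.
var f = function(id, delim) {
    var result = [];
    var acc = '';
    var i = 0;
    while (i < id.length) {
        if (id[i] === delim) {
            if (id[i + 1] === delim) {
                // Escaped delimiter: keep one copy and skip the second.
                acc += delim;
                i++;
            } else {
                // Single delimiter: the current token ends here.
                result.push(acc);
                acc = '';
            }
        } else {
            acc += id[i];
        }
        i++;
    }

    // A trailing empty token is skipped on purpose.
    if (acc !== '') {
        result.push(acc);
    }

    return result;
};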

and some tests:

> f('a-b--', '-')
["a", "b-"]
> f('a-b---', '-')
["a", "b-"]
> f('a-b---c', '-')
["a", "b-", "c"]
> f('a-b----c', '-')
["a", "b--c"]
> f('a-b----c-', '-')
["a", "b--c"]
> f('a-b----c-d', '-')
["a", "b--c", "d"]
> f('a-b-----c-d', '-')
["a", "b--", "c", "d"]

(If the last token is empty, it's meant to be skipped)
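For comparison, a regex-based sketch that produces the same results on the tests listed above: match runs of non-hyphens and hyphen pairs, then collapse each pair to a single hyphen. The helper name `splitAndCollapse` is hypothetical and the delimiter is hard-coded to `-`; this is an alternative under those assumptions, not a replacement for the loop.

```javascript
// Hypothetical helper, not part of the answer above.
// Tokens are runs of non-hyphens and "--" pairs; each pair becomes one "-".
function splitAndCollapse(id) {
    // match() returns null when nothing matches, so fall back to [].
    var matches = id.match(/(?:[^-]|--)+/g) || [];
    return matches.map(function (token) {
        return token.replace(/--/g, "-");
    });
}

console.log(splitAndCollapse("a-b---c"));
// ["a", "b-", "c"]

console.log(splitAndCollapse("a-b-----c-d"));
// ["a", "b--", "c", "d"]
```

The two versions are not identical in every case: for an input with a leading single delimiter such as `-a`, the loop returns an empty first token, while this sketch returns only `["a"]`.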


猫巷女王i

2020-12-18 05:41
I don't know how to do it purely with the regex engine in JS. You could do it this way, which is a little less involved than manually parsing (it assumes the placeholder #!!# never appears in the input):
```
var str = "two-separate-tokens-this--is--just--one--token-another";
str = str.replace(/--/g, "#!!#");
var split = str.split(/-/);
for (var i = 0; i < split.length; i++) {
    split[i] = split[i].replace(/#!!#/g, "--");
}
```
Working demo: http://jsfiddle.net/jfriend00/hAhAB/